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ABSTRACT 



. Context. Precision measurements of the anisotropy of the cosmic microwave background (CMB) are able to detect low-level non- 

Gaussian features caused by either topological defects or the inflation process. These measurements are becoming feasable with the 
development of large arrays of ultra-sensitive bolometric detectors and their use in balloon-borne or satellite missions. However, the 
space environment includes a population of cosmic rays (CRs), which produce spurious spikes in bolometric signals. 
Aims. We analyze the effect of CRs on the measurement of CMB anisotropy maps and the estimate of cosmological non-Gaussianity 
and angular power spectra of the CMB. 

Methods. Using accurate simulations of noise and CR events in bolometric detectors, and de-spiking techniques, we produce simu- 
lated measured maps and analyze the Gaussianity and power spectrum of the maps for different levels and rates of CR events. 
^ . Results. We find that a de-spiking technique based on outlier removal in the detector signals contributing to the same sky pixel is 

■ effective in removing CR events larger than the noise. However, low level events hidden in the noise produce a positive shift of the 

' average power signal measured by a bolometer, and increase its variance. If the number of hits per pixel is large enough, the data dis- 

I tribution for each sky pixel is approximately Gaussian, but the skewness and the kurtosis of the temperatures of the pixels indicate the 

presence of some low-level non-Gaussianity. The standard noise estimation pipeline produces a positive bias in the power spectrum 
at high multipoles. 

CO ' Conclusions. In the case of a typical balloon-borne survey, the CR-induced non-Gaussianity will be marginally detectable in the mem- 

^ ' brane bolometer channels, but be negligible in the spider-web bolometer channels. In experiments with detector sensitivity better than 

100 fiK/ y/JIz, in an environment less favorable than the earth stratosphere, the CR-induced non-Gaussianity is likely to significantly 
^— ' ' aff^ect the results. 

^S) ' Key words. Cosmic Background Radiation - Cosmic rays - Instrumentation: detectors - Methods: data analysis 

o 

, 1. Introduction Arrays of microwave detectors feature high mapping speed 

T— I . and are now starting to operate at the focus of large telescopes 

^ Information about the early Universe is encoded in the primary ^ ^ g^ygr^ ^1 2009, Siringo et al. 2009, Carlstrom et 

amsotropy of the CMB . Whde the Gaussian fluctuations ex- al. [20091 Wilson et al.^OOS, Swetz et al.,2008Jor in CMB po- 

pected in the adiabatic inflationary scenario beautifullyfit the larimeters (see e.g. Hinderks et al. |2009l Yoon et al. [20061 Kuo 

_ available power^ctrum data (see e.g. Nolta et al. [2009[ and ^1. '2006', Samdeben et al., [2007ll. Transition edge supercon- 

d Komatsu et al. [MS, non-Gaussian signals are also expected at ^^j^jj^g ^^j^^^ bolometric detectors, involving only litographic 

a lower level in the maps, due to non-lineanties in the inflation fabrication techniques, are easier to replicate in large arrays, and 

potential (see e.g. Verde et al. 2000) and/or the presence of topo- extremely sensitive 
logical defects (see e.g. Kaiser & Stebbins |1984| l. In addition, 

non-Gaussian secondary anisotropics are imprinted in the post- ^o fully exploit their sensitivity, tiie radiative background has 

recombination universe because of the interaction of CMB pho- ^e minimized. In this sense, the optimal solution is to use 

tons with the large-scale sti-uctures present in the Universe, and bolometric detectors in space, possibly feeding them by means 

the emission of Galactic and exti-agalactic sources. The study °f ^ cryogenic telescope or optical system. The bolometer^f 

of primordial non-Gaussianity can in principle allow to confirm instrument on the Planck mission (Holmes et al. [20081) 

and select an inflation model, but the signal to be detected is very ^o^^ *e PACSand SPIRE instruments on the Herschel 

small. For this reason, it is essential to ensure that the measure- mission (Billot et al. [20091 Schultz et al. [200i represent a first 

ment system does not introduce non-Gaussian features into the ^"-^P ^" direction. 

data. Being extremely sensitive to any form of energy deposited 

on their absorber and thermistors, bolometers are also sensitive 

Send offprint requests to: silvia.masi@romal.infn.it to cosmic rays. The energy deposited by a single MeV proton in 



X 



1 



Masi et al. : Cosmic rays events in bolometric CMB observations 



a cryogenic bolometer is much higher than the typical level of 
the noise (see e.g. Caserta et al. '1990): if an amount of energy 
AE is deposited in a bolometer in a time much shorter than the 
bolometer time constant, the temperature rise of the bolometer 
will be Ar = AE/C, where C is the heat capacity of the detector 
This peak level will be achieved sooner than one time constant; 
the subsequent decay, instead, will be regulated by the time con- 
stant. The typical noise of a bolometer corresponds to physical 
temperature fluctuations much lower than AE/C, so most CR- 
induced spikes will be evident in the bolometer time-ordered- 
data. 

To allow the operation of bolometers in unprotected envi- 
ronments (such as sub-orbital or orbital space instruments), spe- 
cial low cross-section bolometers have been developed (spider- 
webs: Mauskopf et al. '1997', wire-grids: Jones et al. 120031 ). An 
additional benefit of these low-cross-section detectors is the re- 
duced heat capacity with respect to solid-absorber or membrane- 
absorber ones, resulting in a shorter time constant. 

In a polar stratospheric balloon flight, the rate of cosmic- 
ray events in low cross-section (spider-web) bolometers cooled 
to 0.3K is on the order of 0.1 Hz (Masi et al. 2006), which is 
about 20 times less than the rate of events for solid/membrane- 
absorber bolometers at the same temperature in the same en- 
vironment. The events are produced either by primary CR in- 
teracting with the detectors, or by secondary particles, including 
showers of electrons and bremmstrahlung -produced gammas, re- 
sulting from the interactions of primary cosmic rays with the 
metal surrounding the bolometric detector Lower temperature 
detectors of basically the same geometry are affected by lower 
noise, such that the rate of CR events above the noise level is 
even higher; this effect is in part mitigated by their more rapid 
response. Rates between 0.02 Hz and 0.3 Hz have been mea- 
sured for spider-web bolometers cooled to 0. IK in similar strato- 
spheric conditions (see Macias-Perez et al. 120071 ). The promis- 
ing kinetic inductance detectors (KIDs), currently under devel- 
opment, are also sensitive to CRs, at least if the substrate is solid 
Si or sapphire. Anyway, their very fast response reduces signif- 
icantly the fraction of contaminated data. Moreover, their sen- 
sitivity to CRs can be strongly reduced by depositing the KID 
resonator on a membrane. Coherent radiometers, using macro- 
scopic thermally stable components, are much less prone to CR 
glitches. 

Balloon-borne missions, using bolometer arrays, currently 
under development, include among others the EBEX CMB po- 
larimeter (Oxley et al. 2004), the SPIDER experiment to detect 
CMB polarization at larger angular scales (Crill et al.. 120081) . the 
OLIMPO telescope, mainly devoted to high frequency observa- 
tions of the Sunyaev-Zeldovich effect (Nati et al., 2007), and 
the BLAST/BLASTPOL telescope devoted to far infr ared cos- 
mological and Galactic studies (Pascale et al., 120081 Marsden 
et al.. 120081 ) . Given the high mapping speed of large arrays of 
bolometers and the efficient and relatively cheap access to space 
provided by stratospheric balloons, we expect these efforts to 
continue in the future. It is thus relevant to focus on the strato- 
spheric case. 

The data from a bolometric detector can be cleaned by de- 
tecting the spikes and flagging the corresponding section of data 
as unusable for additional analysis. For filtering purposes, the 
masked data can be filled with constrained realizations of noise 
with the same properties as the surrounding data (see e.g. Masi 
et al. 120061 ). However, in the absence of independent cosmic rays 
monitors, a number of low level events will remain unidentified, 
hidden in the noise. If the bolometer is non-ideal, and features 
an additional long time constant, the effect of CR can be an in- 



crease in 1/f noise. In this paper, we focus on the undetected 
cosmic rays hits and study how they can affect the analysis of 
data aimed at measuring CMB anisotropy and non-Gaussianity. 

2. Simulation of bolometer time-ordered data 

To study the effect of CR events, we simulated first time-streams 
of bolometer data in the absence of sky signals, i.e., we added 
CR events to a Gaussian realization of noise G(t) with a given 
standard deviation cr and null average. The time-ordered samples 
of bolometer noise are built as a Gaussian realization, filtered 
with a first-order low-pass filter simulating the thermal response 
of the detector. 

We considered two reference cases with different bolometer 
time constants and CR rates: case Ri has a rate of 2 Hz for a slow, 
membrane absorber bolometer at 0.3K (t - IQms), and case R2 
is a low-cross-section bolometer (spider-web) with a rate of 0. 1 
Hz and a time constant of r = IQms. 

The waiting times for CR events follow an exponential dis- 
tribution of rate R 

P{t) ^ Re '^' ,t > 0. (1) 

The amplitudes A of the pulses induced by CRs are assumed to 
follow an exponential distribution with average (A). The exact 
shape of this distribution for a given experiment depends on the 
spectra of primary cosmic rays in the environment of operation, 
on the distribution and characteristics of the material surround- 
ing the bolometer, on the shape of the sensitive element of the 
bolometer itself, and on the angular distribution of the incom- 
ing cosmic rays in the restframe of the bolometer sensor. The 
important parameter, however, is the fraction of events that have 
amplitudes lower than say 3cr: these are likely to remain hidden 
in the noise. In the case of the exponential distribution that we 
selected, this fraction is 1 - exp(-'icrl{A)), which is ~ 26% if 
<A)/cr = 10. 

The shape of each pulse is modeled as a sudden level jump 
of amplitude A, followed by an exponential decay with a time 
constant t : 

Dit) = Ae-'^\ (2) 

For 10 time constants after the CR event, the simulated data are 
obtained as S it) - D{t) + G(t), whereas elsewhere the simulated 
data are simply G{t). In our initial tests, we do not include sky 
signals in the simulation. 

We simulated chunks of 10 hours of data, to obtain a suffi- 
cient number of events: we have an average of 72 000 (3600) 
events per chunk in case Ri {R2)- Our full set of simulations in- 
cludes 10 000 simulated chunks. 

In Figs. [1] and |2] we show sample subsections of simulated 
data. Large CR spikes are evident, but smaller ones remain hid- 
den in the noise. 

3. Despiking the time-ordered data 

Evident CR events (with amplitudes larger than say 4cr) can be 
easily identified and removed. To remove smaller CR spikes, one 
can take advantage of the peculiar shape of these signals. The 
exact shape depends on the bolometer configuration (single or 
double time constant, or even more complex response) and on 
the details of the front-end electronics, amplifying the nV-level 
bolometer signals. In our simulations, we used a simple expo- 
nential decay, but it would not be difficult to extend our analysis 
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Fig. 1. Top line: sample subsection of simulated bolometer data 
S (t) (converted in i-iKcmb) resulting from the sum of spikes from 
CR events D(t) (middle top line, offset -40 mK for clarity of vi- 
sualization) and bolometer noise G(f) (middle bottom line, off- 
set -50 mK ). In this simulation, the signal is sampled at 200Hz, 
o- = 1.4 mK, r = 0.07,?, R = 2.0Hz, and <A) = 10 mK (case 
^i). Comparing the top and middle lines, it is evident that many 
CR events remain hidden in the noise and are very difficult to 
identify. In the bottom line (off'set -60 mK), we plot the data de- 
spiked following the pixel-based procedure described in the text. 
The missing data have been flagged as unusable because they are 
within 5 time constants of a detected spike. 




-8x10^1 ^ ^ ^ \ ^ ^ ^ \ ^ ^ ^ \ ^ ^ ^ \ ^ ^ , 

1000 1020 1040 1060 1080 1100 

Fig, 2, Same as Fig.[T] with t = Q.Ols, R = Q.lHz, and <A> = 10 
mK (case R2)- 



to more complex shapes, once the impulsive response of the sys- 
tem is known from calibrations. 

In the simulations described in this section, we did not in- 
clude any sky signal. With a typical 200 Hz sampling rate and 
lOOfiK/ yfHz NET, the standard deviation in the bolometer noise 
is on the order of 1400 fiK, while the standard deviation in the 
CMB anisotropy is on the order of 100 fiK. It is thus very un- 
likely that CMB signals can affect the performance of any de- 
spiking procedure. Diffiise Galactic emission can reach levels 
on the order of or higher than 1000 i-iK only in the Galactic 
plane, which is not useful for CMB studies. Point sources are 
also masked in sensitive CMB searches. 

A standard method for identifying events embedded in noise 
is to search for correlations between the noisy time ordered data 
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Table 1. Fraction of true detections (/,) and false detections 
(//) versus threshold level T for the convolution-based despik- 
ing method, for the two reference cases (Ri and R2) (see text). 
Here (A) = lOcr 



and an event template. To estimate the coiTelation, we computed 
the normalized convolution 



C(t) 



^ S {t + u)D{u)du 
J^*^^ D{u)du 



(3) 



In the absence of noise, this correlation is positive for all times 
t where a CR event contaminates the signal. Noise induces fluc- 
tuations in C(t) that, however, are smaller than the fluctuations 
in S it), so there is some advantage in using this estimator. We 
defined a positive threshold T and assumed that all the samples 
with C(f) > r are contaminated by CR events. To analyze the 
efficiency and accuracy of this CR detection method, we con- 
sidered the contaminated samples as a function of the threshold 
level r, and computed the fractions /, and /y . The parameter/, is 
the fraction of events that have C(f) > T and D{t) > D,ni„, where 
D,„i„ = IfiK: these are true detections of CR events. In contrast 
// is the fraction of samples where C(f) > T while D(t) < D„,i„: 
these are false detections caused by noise mimicking CR events. 
For an average amplitude of the CR events equal to 7 times the 
rms of the noise «A) - lOcr), we found that 50% (0.4%) of the 
samples are contaminated in case R\ {Ri)- 

Owing to the noise, we found that the convolution-based 
method fails to identify a number of contaminated samples, and 
introduces a large number of false detections (see Table [1] for 
(A) = lOcr). To avoid a large number of false detections, we 
found that one has to raise the threshold level, although many 
CR events are then missed. The situation is even worse if the 
ratio (A)/cr is lowered. 



4. Pixel-space despiking 

In addition to the problems listed in the previous section, remov- 
ing CR pulses directly from the time-ordered data is inadvisable, 
because of the risk of removing true sky signals (for example, 
fast signals produced by scans over point sources, such as plan- 
ets used as calibrators). 

A more effective strategy is to analyze the set of data sam- 
ples contributing to the same sky pixel: for all of these samples, 
the sky signal is the same, so outliers must be caused by either 
detector noise or CR events. This is true in the absence of point- 
ing errors. In areas with steep brightness gradients (for exam- 
ple near the Galactic plane), the combination of pixelization and 
pointing errors can lead to spurious glitches. We neglect these 
effects here, since they are not relevant to CMB studies. 

We used a simple iterative procedure to remove outliers. For 
a given pixel p, the average {S)p and the standard deviation cr^, 
in all the contributing time-ordered signals Sk were computed. 
For sample k, if the difference Sk - {S)p was larger than Scr^, 
the sample was classified as an outlier, and removed. Detection 
of outliers was therefore performed in pixel space, but outliers 
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Fig. 3. Power spectrum of the data for case Ri (top line), of the 
despiked data, and of the original noise-only data (middle lines). 
The despiking procedure does not introduce spectral features in 
the data, as is evident from the difference between the despiked 
spectrum and the original noise-only spectrum (bottom line). 

were removed in the time domain. New values of both the aver- 
age and variance were computed from the remaining samples of 
the set, and new outliers were identified and removed. This pro- 
cedure was repeated until the average and the variance no longer 
changed (no or few new outliers are found). 

In general, only a few iterations are needed to achieve con- 
vergence. We found that after 6 iterations the efficiency of this 
removal method is similar to that of the convolution-based one, 
yet with this method we do not remove true sky signals. 

The despiked timelines are very similar to the noise-only 
time-lines, and their spectrum (in frequency domain) closely re- 
sembles the noise spectrum, as shown in Fig. [3] for our worst- 
case Ri. 

The average in each pixel converges to a positive value, while 
according to detector noise only it would be very close to zero. 
This is due to CR hits of amplitude smaller than the noise: these 
remain undetected and produce a positive bias. They also in- 
crease the standard deviation in the data contributing to the same 
pixel. 

For this reason, the 1 -point distribution of temperatures in 
pixels obtained after the averaging and the outlier-removal pro- 
cesses will be shifted towards positive values, and be broader 
than expected from instrument noise only. 

After a CR event, the data are contaminated for a certain 
amount of time, depending on both the spike amplitude and the 
time constants of the system. Some of these data might be re- 
covered by removing a best-fit spike profile from the timeline. 
However, due to noise, the fit will not be perfect. We prefer to be 
conservative, and flag as unusable all the data after a spike, for a 
period of 5 time constants. The fraction of flagged data is plotted 
in Fig.|4]for different CR rates and bolometer time constants. The 
obvious effect of CR events is thus a significant decrease in the 
redundancy of the maps. Only the use of fast detectors mitigates 
this problem. In the following, we focus on the second effect, i.e. 
the introduction of bias and non-Gaussian features in the maps. 

We study two different cases. The first case is a targeted ob- 
servation of a small map (about 1.5" x 1.5" in size, with a 4' 
FWHM resolution) for a relatively short time (10 hours of inte- 
gration for a single detector), such as the maps observed by the 
OLIMPO experiment to measure the Sunyaev-Zeldovich effect 
in well known clusters (Nati et al. 120071) . The second is the ob- 
servation of a large (about 10° x 10° in size) deep survey of a 
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Fig. 4. Fraction of data flagged as contaminated by CR events, 
versus average CR rate, in a bolometric experiment. A detector 
noise of lOOju/:/ V^and an average amplitude of CR events of 
10 mK have been assumed. Squares refer to fast detectors (time 
constant of 10 ms), stars refer to slow detectors (time constant of 
70 ms). A pixel-based de-spiking algorithm iteratively clipping 
all the data at more than 3cr from the pixel average has been 
used. All data within 5 time constants of a CR event have been 
flagged. 



clean region, with 7 days of integration. This is similar to the 
blind deep survey of CMB and undetected SZ clusters carried 
out by OLIMPO. We assume the same angular resolution for 
this survey as well. 

In Fig.|5] we plot the 1 -point distributions versus the iteration 
number in the case of observations of 4320 independent pixels 
(small map case), in our two reference cases, for a total integra- 
tion time on the map of 10 hours. The results for the moments of 
these distributions are reported in Table |2] 

It is clear that the outlier removal process is effective, since 
the 1 -point distribution approaches a Gaussian distribution after 
a few iterations. The skewness and kurtosis of the 1-point distri- 
bution return to values consistent with zero, after the despiking 
process. This means that the number of low level spikes aver- 
aged in each pixel is large enough to produce a more Gaussian 
distribution, at least at the level of sensitivity considered here. 

However, the positive bias and the increase in the variance in 
the estimated pixel temperatures caused by undetected CR per- 
sist, as is evident from Fig.|5]and Table|2] 

If the average rate of CR hits is not constant during the obser- 
vations of different pixels, this positive bias varies accordingly, 
introducing fake structures in the measured maps. A modulation 
of the CR rate can be introduced by the scanning strategy of 
the instrument, which induces variations in the absorbing mate- 
rial in-between the CR flux and the bolometers, thus producing 
scan-synchronous systematic effects. For example, an asymmet- 
ric satellite spinning in the solar wind could introduce spurious 
dipole or higher order anisotropy (depending on the structure of 
the satellite) aligned to the solar wind direction. It would be very 
useful to include in the instrumental setup an independent CR 
flux monitor, as close as possible to the focal plane bolometers. 
The level of the spurious signal depends on the CR composition, 
flux and spectrum, on the cross-section of the bolometers, and 
on the structure of the instrument; this can be quantified only by 
focusing on the specific case. 
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Fig. 5. Left: One-point distributions of pixel temperatures esti- 
mates {S }p, vs. despiking iteration. In this simulation, there is no 
sky signal, the detector NET is 100 fiK yfs, the time constant is 
70 ms, the sampling rate is 200 Hz (cr = 1 AmK), and the total 
integration time for 4320 pixels is 10 hours, in an environment 
producing an average amplitude of CR events of 10 mK and an 
average rate of 2 Hz (case R\). The distributions are labeled with 
the number of outlier removal iterations. After each iteration, the 
1 -point distribution moves to the left, approaching a Gaussian 
distribution but remaining shifted with respect to the distribu- 
tion of detector noise without CR events (which is the leftmost 
histogram, hatched and labeled "no CR"). Right: Here the time 
constant is 10 ms, in an environment producing an average am- 
plitude of CR events of 10 mK and an average rate of 0.1 Hz 
(case ^2): the different distributions are hardly distinguishable. 



Even if the CR flux is perfectly steady, the increase in the 
variance of the detected data for each pixel can bias the noise 
estimates, which are needed to estimate the power spectra. In 
the standard analysis procedure for CMB maps one estimates 
the noise from the data, assuming Gaussian noise, and then uses 
Gaussian Monte Carlo simulations to assess the bias in the power 
spectrum (see e.g. Hivon et al. 120021 Polenta et al. l2005l l. 

In principle, this can bias the noise estimates, and, at a lower 
level, might affect the non-Gaussianity parameters relevant to 
cosmology. 

These effects are studied in the following paragraphs, where 
we focus on observations of the larger survey, which is most 
useful to minimizing cosmic variance and measuring the power 
spectrum of the CMB and non-Gaussianity parameters. This ~ 
100 square degrees map contains N -%6 070 1.7' pixels. A total 
integration time of 1 week is assumed. 

We define a data timeline contaminated by spikes for the 
cases R\ and R2 described above (70 and 10 ms detector time 
constant, respectively). These are, respectively, a hard case and 
a mild case, from the point of view of CR contamination. We 
thus perform iterative despiking as described above, and analyze 
the resulting large maps. 

5. Tests of Gaussianity 

Primordial CMB anisotropics are known to have Gaussian dis- 
tributions to leading order (see e.g. Komatsu et al. 120031 De 
Troia et al. 2003, Natoli et al., 2009 ). Smafl deviations from 
Gaussianity have however been predicted and, if observed, may 
be used to constrain inflationary models (see Bartolo et al. 120041 
for a review). 

Measurements of this non-Gaussian component are however 
hampered by several sources of instrumental or non instrumental 
{e.g., foreground contamination) systematic effects, which may 
induce spurious non-Gaussian signatures in the data. 
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Fig. 6. Histogrammed probabilities of the KS test, for the Ri 
(top) and R2 (bottom) cases. The hatched histograms refer to the 
reference Gaussian sets. 



In particular, undetected cosmic-ray hits add a non-Guassian 
contribution to detector noise, which is usually assumed to be 
otherwise normally distributed. Furthermore, the despiking pro- 
cedures described above are based on clipping and may also alter 
the statistics of the dataset. 

We first perform a Kolmogorov-Smirnov (KS) test on the 
pixel values for (1) five iteration despiking for the R\ and R2 and 
(2) no CR contamination, the latter to be used as the reference 
purely Gaussian dataset. The KS test (see Press et al. 1 19921 1 is 
a standard test to quantify the probability that a sample can be 
ascribed to a given distribution. It can also be used to reject the 
null hypothesis that two different samples are drawn from the 
same distribution. For each map, we compute the KS measure 

Dote = max 15^(0- F(f)|, 

t 

where t is the map temperature value, F the cumulative of the 
Gaussian distribution function, and the empirical distribu- 
tion function of a simulated noise map. We also define Zobs = 
Dohs Va^- The KS test provides a probability P{Z > Zobs) for a 
given map. In Fig. |6l we plot the KS probabilities obtained for 
the 140 Monte Carlo maps as histograms for the Ri (top) and R2 
cases, along with their respective Gaussian (no CR but rescaled 
effective varianceQ) reference set (hatched). The Ri case exhibits 
a clear deviation from its reference Gaussian counterpart, while 
in the case of R2 this deviation is not at all evident. Even for Ri, 
any single map has (roughly) a 50% probability of being flagged 
as non-Gaussian by our KS analysis. The same considerations 
apply to the observed distribution of the Z coefficients, shown in 
FigE] 

To quantify our detection of non-Gaussianity, we define two 
simple NG estimators: the normalized skewness (S3) and kurto- 
sis (54) of each map. The corresponding histograms are shown 



' The reference Gaussian dataset is different in the two cases because 
the noise realizations are different and their variances are corrected to 
account for the increase in noise caused by despiking; see the next sec- 
tion for further details. 
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Fig. 7. Same as Fig.|6]but for the Zobs KS coefficients. 

in Fig. [8] again for the cases Ri, Ri, and their respective ref- 
erence Gaussian sets (hatched). We note how the R2 case also 
exhibits deviation from Gaussianity in the case of the kurtosis. 
The empirical (histogrammed) distribution functions can them- 
selves be subject to a KS test against their Gaussian reference 
sets: in this case, one simply measures the distance between the 
empirical cumulative distribution functions. This test rejects at 
high significance (> 99.99%) the null hypothesis that the dis- 
tribution of S-i and ^4 is Gaussian for the R\ case. For R2, the 
Gaussianity of 5 4 is also rejected with similar confidence, but 
the null hypothesis is accepted {P > 35%) for the 5 3 case. 

In conclusion, the non-Gaussianity level induced by CR con- 
tamination in the large noise maps is weak and cannot be de- 
tected with high statistical significance even in the Ri case, while 
the R2 maps are, taken one at a time, indistinguishable from 
Gaussian maps, at least for the tests employed here. However, 
the situation changes when a set of maps is considered: our 
Monte Carlo analysis shows that, when a moderate number (140 
in our case) are analyzed jointly, non-Gaussian signatures are 
detected even for /?2- As a consequence, an experiment that uses 
a similar detector but gathers significantly more data than con- 
sidered here, may be prone to CR-induced non-Gaussianity, es- 
pecially when high precision measurements are sought. This is 
especially critical for high sensitivity measurements of the fyi 
parameter that quantifies the level of non-Gaussianity in the pri- 
mordial perturbations (Bartolo et al.|2004). In particular, Planck 
(The Planck Collaboration. 120061 ) is expected to tighten existing 
constraints on f^L by roughly an order of magnitude. We leave 
a detailed study of the level of contamination by residual CR on 
/nl estimates to a future paper. 

6. Biasing of power spectra 

The angular power spectrum of the CMB is a most valuable cos- 
mological observable, which can be used to extract information 
about the underlying physical model. Thus, it is critical to assess 
the effect of contamination from residual, undetected spikes. 

We thus estimate the angular power spectrum using cRO- 
MAster, a pseudo-Cf estimator based on MASTER (Hivon et 
al. 120021 ). which was originally developed for (and applied to) 



Skewness and Kur;osis for R, case 



z 10 




-0.04 -0.02 0.00 0.02 0.04 

Skewness S3 




-0.10 -0.05 0.00 0.05 0.10 

Kurtosis S^ 



Skewness and Kur;osis for Rj case 




-0.10 -0.05 0.00 0.05 0.10 

Kurtosis S^ 



Fig. 8. Empirical observed probabilities for skewness (5 3) and 
kurtosis (5 4), sampled from 140 Monte Carlo noise maps (red). 
Shown are again the Ri (top) and R2 cases. The reference 
Gaussian sets are shown hatched. When comparing to the val- 
ues reported in Table |2] it should be noted that the histograms 
here are computed over a considerably larger number of pixels. 



BOOMERanG-B03 ( Jones et al. l2006l Piacentini et al. l2006l 
Montroy et al. 2006 Masi et al. 2006) and later improved for 
PLANCK data analysis. cROMAster can estimate both auto- and 
cross-power spectra, where the former are more efficient but re- 
quire noise removal and the latter are less efficient but are natu- 
rally unbiased (Polenta et al. 2005 ), therefore residual bias is not 
expected in the pure cross-spectrum case. We verified this asser- 
tion using simulations. The case of auto-spectra is more delicate 
since they require a companion Monte Carlo dataset to estimate 
and remove residual noise. 

To verify the effect of the spikes, we firstly generated a 
Monte Carlo dataset that ignores the presence of residual spikes 
themselves, and only relies on the nominal detector noise level. 
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Fig. 9. Simulated measurement of the power spectrum of CMB 
anisotropy in a bolometric experiment affected by CR hits as in 
case Ri. The noise estimate neglects the presence of CR spikes. 
As a result, the spectrum is heavily biased (black crosses show 
this residual). 



Fig. 11. Same as Fig. |9] Here the noise estimate has been per- 
formed from the dataset using a standard pipeline, i.e. subtract- 
ing the best fit signal contribution from the data timelines. The 
non-Gaussian nature of the residual spikes has been neglected. 
The residual bias in the power spectrum is small, and appears to 
be limited to the highest multipoles. 
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Fig. 10. Same as Fig.|9] Here the noise estimate accounts for CR 
spikes, assuming that their distribution is known a-priori. As a 
result, the spectrum is not biased. 



When using this dataset as a noise estimate, the resulting power 
spectrum is heavily biased, as shown in Fig.|9] 

Thus taking into account the effect of spikes in the Monte 
Carlo dataset is important. This can be done in two ways: the first 
and in principle more precise method is to process the Monte 
Carlo timelines by adding simulated spike signal with the same 
properties of the spikes seen in real data. When this is performed, 
a bias free auto-spectrum can be obtained as shown in Fig.fTOl 

However, this procedure relies on the knowledge of the un- 
derlying spike's rate and distribution. The spike characteristics 
and statistics can in principle be extrapolated from the data them- 
selves, relying on the observed events. This procedure, however, 
significantly complicates the noise estimation and Monte Carlo 
generation pipeline of the experiment. Moreover, this approach 
is prone to biases, since the estimated statistics are derived from 
the most energetic CR events, which are detected above the 
noise, while the low energy ones, which contaminate the data, 
are not detected: their distribution can only be estimated by 
means of analytic extrapolation of the high energy distribution. 
The reliability is thus unknown. The variance produced by the 
extrapolated low-energy part of the distribution can be used to 
guess the importance of the problem. The only way to obtain ro- 



bust estimates is to perform detailed and extensive Monte Carlo 
simulations (using packages such as GEANT-4, see Agostinelli 
et al. 2003 ) of the interaction of the full instrument with the cos- 
mic environment. If the high-energy part of the simulated spec- 
trum fits the data, the low energy part of the simulated spectrum 
can be used to estimate the noise bias from undetected spikes. 
This procedure is certainly complex and computationally very 
intensive. 

We thus verified wether a simpler approach would produce 
sufficiently accurate results for power spectrum estimation. For 
this second scheme, we estimated the noise properties from the 
data that are contaminated by spikes but, in the timeline simula- 
tion, we avoided modeling the spike contribution and despiking 
procedure. In practice, we first measured the noise properties of 
the despiked timeline, after subtracting the signal contribution by 
means of standard iterative techniques (Ferreira and Jaffe 2000]l, 
and used this dataset to estimate the noise power spectrum. Since 
a timeline contaminated by spikes exhibits some level of non- 
Gaussianity, the noise power spectrum does not encode all the 
information about our noise properties. However, we ignored 
this complication, and used the estimated spectrum information 
to generate stationary Gaussian realization of noise. Since the 
spike residual is probably non-Gaussian, this procedure is not 
strictly correct. However, as shown in Fig. (TT] the residual bias 
we obtain is -visually- very small. 

To quantify this statement, we performed the Hausman test 
(Polenta et al. 2005 ) on the band power we obtain from our sim- 
ulations. The Hausman test is a powerful procedure to assess the 
significance of a residual noise bias in the spectrum. It uses both 
cross-spectra and auto-spectra estimates. We restrict ourselves to 
the Ri and R2 cases described above. As in Polenta et al. 2005] 
we define three test statistics to detect a bias in the noise esti- 
mation, si - supi. Bi(r), S2 - sup,. |Bi(r)|, and - B^(r)dr, 
where B^ir) is a random process defined as 

J ILr] 

BL{r)= —J]Hf, re [0,1], (4) 
yL 

where [.] denotes integer part, L is the maximum multipole used 
in the harmonic analysis, and H( is the difference of the cross 
and auto (or in general unbiased versus biased) power spectrum 
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Si 
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96% 
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39% 


Si 


59% 


19 % 
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Si 


48 % 


15 % 


7 % 


S3 


38 % 


8% 


1.4% 



Table 3. Confidence level for bias detection with the Hausman 
test in cases Ri (top three lines) and R2 (bottom three lines), 
with a naively simulated noise estimation pipeline (see text). The 
parameters si, S2, and 53 represent three different bias estimators 
defined in Polenta et al. l2005l (see text). 



estimators normalized by its variance. It can be shown that as 
L ^ 00, Biir) converges to a Brownian motion process, whose 
properties are widely studied and well-known, and therefore can 
be used to test the null bias hypothesis. In this case, rather than 
relying on the asymptotical distribution, we used Monte Carlo 
simulations to draw the empirical distributions of the test statis- 
tics. Our results for the confidence of bias detection with 68%, 
95%, and 99% probability are shown in Table[3] 

We note that, despite being small, the bias can always be 
detected at a high statistical significance. This is a strong indi- 
cation that the noise estimation pipeline for an high precision 
experiment must account for the presence of spikes in a more 
accurate way than the simple procedure set forth above, at least 
if auto spectra are desired. 

While we do not explicitly simulate an experiment capable 
of measuring polarization spectra, it is clear that the same con- 
clusions hold, a fortiori, when linear polarization can be mea- 
sured. On the other hand, a spectral pipeline that relies only on 
cross-spectra is very robust to the effect of spikes. 

Hence, cross-spectra, while being less efficient than auto- 
spectra, are certainly more adequate for an experiment contam- 
inated by cosmic rays, at least as long as correlated events be- 
tween different detectors are excluded. 



7. Conclusions 

Bolometric observations carried out from space are affected by 
cosmic rays. The data must be despiked to be used efficiently 
and avoid serious biases in the results. Pixel-based outliers re- 
moval works well: the power spectra of the despiked timelines 
closely resemble the original Gaussian noise used in the sim- 
ulation. However, low-level events remain hidden in the noise, 
resulting in a positive shift of the average signal measured in 
each pixel, and increasing its variance. 

The maps obtained from despiked data are non-Gaussian. 
The level of non-Gaussianity depends on the rate of the spikes, 
on the deposited energy and on the time constant and noise of the 
detectors. Using the skewness and the kurtosis of the pixel tem- 
peratures as simple non-Gaussianity indicators, we have found 
that, in the case of the 100 square degree survey, the CR-induced 
non-Gaussianity is unlikely to be detected if spider-web bolome- 
ters are used, but would be marginally detectable for slow mem- 
brane detectors. For experiments with longer integration times 
and/or lower noise detectors, the CR-induced non-Gaussianity 
will be significant. This will probably affect the constraints on 
the primordial non-Gaussianity expected by present space-borne 
missions. Detailed studies need to be performed in these cases. 



Using the standard analysis pipeline on the map produced 
by the despiked timeline, we have found that the residual hidden 
events produce a positive bias in the angular power spectrum of 
the map at high multipoles. In the specific example of the 100 
square degree survey, the expected bias is negligible for spider- 
web bolometers, but can be important for slow membrane de- 
tectors. However, since all modern experiments use bolometer 
arrays, cross-spectra can be computed in place of auto-spectra. 
These spectra are virtually unaffected by the CR-induced bias 
problem. The only unavoidable problem however is a significant 
decrease in the effective integration time and redundancy of the 
maps, in the case of slow membrane detectors. 
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Iteration 


average (^K) 


rms (^K) 


skewness 


kurtosis 


no CR 


0.0 ± 0.7 


39.6 ± 0.4 


-0.001 ± 0.038 


0.022 ± 0.078 


U {Ki) 

l(Ri) 
2(Ri) 
3(Ri) 
4(Ri) 
5(Ri) 


y i^jy ± 0^ 

(529 ± 3 ) 
(173.8 ± 1.5) 
(94.3 ± 1.1) 
(81.4 ± 1.1) 
(79.3 ± 1.1) 


(92.3 ±2.1) 
(59.1 ±0.8) 
(56.0 ± 0.6) 
(57.4 ± 0.6) 
(57.9 ± 0.6) 


yxj.Vyj ± U.U/Uy 

(0.231 ± 0.070) 
(0.092 ± 0.042) 
(0.020 ± 0.038) 
(0.014 ± 0.038) 
(0.012 ± 0.038) 


{C\ C\(\ -\- C\ 
(,U.UO ± Kj.Vj) 

(0.11 ±0.15) 
(0.059 ± 0.087) 
(0.029 ± 0.076) 
(0.029 ± 0.076) 
(0.029 ± 0.077 ) 


u (,K2; 
KRj) 

2(R2) 

3(R2) 
4(R2) 
5(R2) 


yVZ.j ± \j. 1 ) 

(1.1 ±0.7) 
(0.8 ± 0.7) 
(0.8 ± 0.7) 
(0.8 ± 0.7) 
(0.8 ± 0.7) 


(A\ Q -1- n 
yH^L.y ± U.J ) 

(40.6 ± 0.4) 
(40.9 ± 0.4) 
(41.0 ±0.5) 
(41.0 ±0.4) 
(41.0 ±0.4) 


(0.001 ± 0.037) 
(0.001 ± 0.037) 
(0.001 ± 0.037) 
(0.001 ± 0.037) 
(0.001 ± 0.037) 


(0.029 ± 0.078) 
(0.029 ± 0.077) 
(0.029 ± 0.078) 
(0.029 ± 0.078) 
(0.029 ± 0.078) 



Table 2. Parameters of the 1 -point distribution of pixel temperature estimates, versus iteration of the pixel-based despiking, for cases and Ri, 
(observations of small maps, same conditions as in Fig.|5]l. Iteration refers to the values without despiking. The first line reports the noise-only 
case (no CRs). The errors describe the dispersion in the results of 3000 simulations. 
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